Smart Paper Notes

Unscented Kalman filter with stable embedding for simple, accurate and computationally efficient state estimation of systems on manifolds in Euclidean space

Jae-Hyeon Park , Dong Eui Chang

Category: Robotics

2022-08-22

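This paper proposes a simple, accurate and computationally efficient method for applying the ordinary unscented Kalman filter developed in Euclidean space to systems whose dynamics evolve on manifolds. We use the mathematical theory called stable embedding to build a variant of the unscented Kalman filter that keeps state estimates in close proximity to the manifold while exhibiting excellent estimation performance. We confirm the performance of our filter by applying it to a satellite system model and comparing it with other unscented Kalman filters designed specifically for systems on manifolds. Our filter has a low estimation error, keeps the state estimates in close proximity to the manifold as expected, and consumes a small amount of computation time. Our filter is also simple and easy to use, because it directly employs the off-the-shelf standard unscented Kalman filter designed in Euclidean space, without any particular manifold-structure-preserving discretization method or coordinate transformation.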

Translated by Google Translate

A Survey for In-context Learning

Qingxiu Dong , Lei Li , Damai Dai , Ce Zheng , Zhiyong Wu , Baobao Chang , Xu Sun , Jingjing Xu , Lei Li , Zhifang Sui

Category: Natural Language Processing | Artificial Intelligence

2022-12-31

With the increasing ability of large language models (LLMs), in-context learning (ICL) has become a new paradigm for natural language processing (NLP), where LLMs make predictions based only on contexts augmented with a few training examples. Exploring ICL to evaluate and extrapolate the ability of LLMs has become a new trend. In this paper, we survey and summarize the progress, challenges, and future work in ICL. We first present a formal definition of ICL and clarify its relation to related studies. Then, we organize and discuss advanced techniques of ICL, including training strategies, prompting strategies, and so on. Finally, we present the challenges of ICL and provide potential directions for further research. We hope our work can encourage more research on uncovering how ICL works and on improving ICL.

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Fast DistilBERT on CPUs

Haihao Shen , Ofir Zafrir , Bo Dong , Hengyu Meng , Xinyu Ye , Zhe Wang , Yi Ding , Hanwen Chang , Guy Boudoukh , Moshe Wasserblat

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-10-27

Transformer-based language models have become the standard approach to solving natural language processing tasks. However, industry adoption usually requires the maximum throughput to comply with certain latency constraints that prevent Transformer models from being used in production. To address this gap, model compression techniques such as quantization and pruning may be used to improve inference efficiency. However, these compression techniques require specialized software to apply and deploy at scale. In this work, we propose a new pipeline for creating and running Fast Transformer models on CPUs, utilizing hardware-aware pruning, knowledge distillation, quantization, and our own Transformer inference runtime engine with optimized kernels for sparse and quantized operators. We demonstrate the efficiency of our pipeline by creating a Fast DistilBERT model showing minimal accuracy loss on the question-answering SQuADv1.1 benchmark, and throughput results under typical production constraints and environments. Our results outperform the existing state-of-the-art runtime, Neural Magic's DeepSparse, by up to 50% and deliver up to a 4.1x speedup over ONNX Runtime. Source code is publicly available at https://github.com/intel/intel-extension-for-transformers.

Boosting Semi-Supervised Semantic Segmentation with Probabilistic Representations

Haoyu Xie , Changqi Wang , Mingkai Zheng , Minjing Dong , Shan You , Chong Fu , Chang Xu

Category: Computer Vision

2022-10-26

Recent breakthroughs in semi-supervised semantic segmentation have been developed through contrastive learning. In prevalent pixel-wise contrastive learning solutions, the model maps pixels to deterministic representations and regularizes them in the latent space. However, there exist inaccurate pseudo-labels which map the ambiguous representations of pixels to the wrong classes due to the limited cognitive ability of the model. In this paper, we define pixel-wise representations from a new perspective of probability theory and propose a Probabilistic Representation Contrastive Learning (PRCL) framework that improves representation quality by taking its probability into consideration. By modelling the mapping from pixels to representations as a probability via multivariate Gaussian distributions, we can tune the contribution of the ambiguous representations to tolerate the risk of inaccurate pseudo-labels. Furthermore, we define prototypes in the form of distributions, which indicate the confidence of a class, while a point prototype cannot. Moreover, we propose to regularize the distribution variance to enhance the reliability of representations. Taking advantage of these benefits, high-quality feature representations can be derived in the latent space, thereby further improving the performance of semantic segmentation. We conduct extensive experiments to evaluate PRCL on Pascal VOC and CityScapes and demonstrate its superiority. The code is available at https://github.com/Haoyu-Xie/PRCL.

Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for Multi-Behavior Recommendation

Chang Meng , Ziqi Zhao , Wei Guo , Yingxue Zhang , Haolun Wu , Chen Gao , Dong Li , Xiu Li , Ruiming Tang

Category: Artificial Intelligence

2022-08-03

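Multiple types of behaviors (e.g., clicking, adding to cart, purchasing) exist in most real-world recommendation scenarios, and they are beneficial for learning users' multi-faceted preferences. Because the multiple types of behaviors explicitly exhibit dependencies, effectively modeling complex behavior dependencies is crucial for multi-behavior prediction. State-of-the-art multi-behavior models learn behavior dependencies indistinguishably, taking all historical interactions as input. However, different behaviors may reflect different aspects of user preference, which means that some irrelevant interactions may act as noise for the target behavior to be predicted. To address these limitations, we introduce multi-interest learning to multi-behavior recommendation. More specifically, we propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning (CKML) framework to learn shared and behavior-specific interests for different behaviors. CKML introduces two advanced modules, namely Coarse-grained Interest Extracting (CIE) and Fine-grained Behavioral Correlation (FBC), which work jointly to capture fine-grained behavioral dependencies. CIE uses knowledge-aware information to extract initial representations of each interest. FBC incorporates a dynamic routing scheme to further assign each behavior among interests. In addition, we use a self-attention mechanism to correlate different behavioral information at the interest level. Empirical results on three real-world datasets verify the effectiveness and efficiency of our model in exploiting multi-behavior data. Further experiments demonstrate the effectiveness of each module and the robustness and superiority of the shared and specific modeling paradigm for multi-behavior data.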

Adversarial Focal Loss: Asking Your Discriminator for Hard Examples

Chen Liu , Xiaomeng Dong , Michael Potter , Hsi-Ming Chang , Ravi Soni

Category: Computer Vision | Machine Learning

2022-07-15

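Focal Loss has gained incredible popularity because it uses a simple technique to identify and utilize hard examples to achieve better performance on classification. However, this method does not easily generalize outside of classification tasks, such as keypoint detection. In this paper, we propose a novel adaptation of Focal Loss for keypoint detection tasks, called Adversarial Focal Loss (AFL). AFL is not only semantically analogous to Focal Loss, but also works as a plug-in upgrade for arbitrary loss functions. While Focal Loss requires the output of a classifier, AFL leverages a separate adversarial network to produce a difficulty score for each input. This difficulty score can then be used to prioritize learning on hard examples, even in the absence of a classifier. In this work, we show AFL's effectiveness in enhancing existing methods in keypoint detection and verify its capability to re-weight examples based on difficulty.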
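
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of the standard Focal Loss for a single example, not of AFL itself. It assumes the usual form, in which the cross-entropy term is scaled by (1 - p)^gamma, where p is the probability the classifier assigns to the true class. The function names and the default gamma = 2.0 are illustrative choices, not taken from the paper.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Cross-entropy for one example, given the probability of the true class."""
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -math.log(p)


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Standard focal loss: cross-entropy scaled by (1 - p)^gamma.

    Easy examples (p close to 1) are down-weighted, so hard examples
    dominate the total loss. gamma = 0 recovers plain cross-entropy.
    """
    p = min(max(p_true, 1e-12), 1.0)
    return ((1.0 - p) ** gamma) * cross_entropy(p)


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):
        print(f"p={p}: CE={cross_entropy(p):.4f}  FL={focal_loss(p):.4f}")
```

The weighting factor here depends on a classifier probability, which is why, as the abstract notes, this form does not transfer directly to tasks without a classifier output.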

Are we really making much progress? Revisiting, benchmarking, and refining heterogeneous graph neural networks

Qingsong Lv , Ming Ding , Qiang Liu , Yuxiang Chen , Wenzheng Feng , Siming He , Chang Zhou , Jianguo Jiang , Yuxiao Dong , Jie Tang

Category: Machine Learning

2021-12-30

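Heterogeneous graph neural networks (HGNNs) have been blossoming in recent years, but the unique data processing and evaluation setups used by each work obstruct a full understanding of their progress. In this work, we present a systematic reproduction of 12 recent HGNNs using their official code, datasets, settings, and hyperparameters, revealing surprising findings about the progress of HGNNs. We find that simple homogeneous GNNs, such as GCN and GAT, are largely underestimated because of improper settings. GAT with proper inputs can generally match or outperform all existing HGNNs across various scenarios. To facilitate robust and reproducible HGNN research, we construct the Heterogeneous Graph Benchmark (HGB), consisting of 11 diverse datasets with three tasks. HGB standardizes the process of heterogeneous graph data splits, feature processing, and performance evaluation. Finally, we introduce a simple but very strong baseline, Simple-HGN, which significantly outperforms all previous models on HGB, to accelerate the advancement of HGNNs in the future.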

An Empirical Study of Adder Neural Networks for Object Detection

Xinghao Chen , Chang Xu , Minjing Dong , Chunjing Xu , Yunhe Wang

Category: Computer Vision

2021-12-27

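Adder neural networks (AdderNets) have shown impressive performance on image classification using only addition operations, which are more energy efficient than traditional convolutional neural networks built with multiplications. Compared with classification, there is a strong demand for reducing the energy consumption of modern object detectors via AdderNets in real-world applications such as autonomous driving and face detection. In this paper, we present an empirical study of AdderNets for object detection. We first reveal that the batch normalization statistics in the pre-trained adder backbone should not be frozen, because of the relatively large feature variance of AdderNets. Moreover, we insert more shortcut connections in the neck and design a new feature fusion architecture to avoid the sparse features of adder layers. We present extensive ablation studies to explore several design choices of adder detectors. Comparisons with state-of-the-art methods are conducted on the COCO and PASCAL VOC benchmarks. Specifically, the proposed Adder FCOS achieves 37.8% AP on the COCO val set, demonstrating performance comparable to that of its convolutional counterpart with about a 1.4x energy reduction.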
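
The abstract does not spell out what "only addition operations" means. As a rough illustration, assuming the formulation from the original AdderNet work, in which a filter response is the negative L1 distance between an input patch and the filter weights, the sketch below contrasts it with an ordinary multiply-accumulate response. The function names and the flat-list representation of patches are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def conv_response(patch: Sequence[float], weights: Sequence[float]) -> float:
    """Ordinary convolution response: sum of element-wise products."""
    if len(patch) != len(weights):
        raise ValueError("patch and weights must have the same length")
    return sum(x * w for x, w in zip(patch, weights))


def adder_response(patch: Sequence[float], weights: Sequence[float]) -> float:
    """Adder-style response: negative L1 distance, no multiplications.

    The response is largest (zero) when the patch equals the filter and
    becomes more negative as they differ.
    """
    if len(patch) != len(weights):
        raise ValueError("patch and weights must have the same length")
    return -sum(abs(x - w) for x, w in zip(patch, weights))


if __name__ == "__main__":
    patch = [1.0, 2.0, 3.0]
    weights = [1.0, 0.0, 5.0]
    print("conv :", conv_response(patch, weights))
    print("adder:", adder_response(patch, weights))
```

Because every adder response is non-positive, the output statistics differ from those of a convolution, which is consistent with the abstract's remark about feature variance and batch normalization.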

What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers

Boseop Kim , HyoungSeok Kim , Sang-Woo Lee , Gichang Lee , Donghyun Kwak , Dong Hyeon Jeon , Sunghyun Park , Sungju Kim , Seonhoon Kim , Dongpil Seo

Category: Natural Language Processing

2021-09-10

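GPT-3 shows the remarkable in-context learning ability of large-scale language models (LMs) trained on billions-scale data. Here we address some remaining issues less reported in the GPT-3 paper, such as a non-English LM, the performance of models of different sizes, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a Korean variant of GPT-3 trained on a Korean-centric corpus of 560B tokens. Enhanced by our Korean-specific tokenization, HyperCLOVA with our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performance on various downstream tasks in Korean. In addition, we show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline. We then discuss the possibility of realizing a No Code AI paradigm by introducing HyperCLOVA Studio, an interactive prompt engineering interface that provides AI prototyping capabilities to non-experts in ML. Finally, we demonstrate the potential of our methods with three successful in-house applications.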
